Fully Convolutional Attention Localization Networks: Efficient Attention Localization for Fine-Grained Recognition

Authors

  • Xiao Liu
  • Tian Xia
  • Jiang Wang
  • Yuanqing Lin
Abstract

Fine-grained recognition is challenging due to the subtle local inter-class differences versus the large intra-class variations such as poses. A key to addressing this problem is to localize discriminative parts to extract pose-invariant features. However, ground-truth part annotations can be expensive to acquire. Moreover, it is hard to define parts for many fine-grained classes. This work introduces Fully Convolutional Attention Networks (FCANs), a reinforcement learning framework to optimally glimpse local discriminative regions adaptive to different fine-grained domains. Compared to previous methods, our approach enjoys four advantages: 1) the three components (feature extraction, visual attention and fine-grained classification) are unified in an end-to-end system; 2) the weakly-supervised reinforcement learning procedure requires no expensive part annotations; 3) the fully-convolutional architecture speeds up both training and testing; 4) the greedy reward strategy accelerates the convergence of the learning. We demonstrate the effectiveness of our method with extensive experiments on four challenging fine-grained benchmark datasets: Stanford Dogs, Stanford Cars, CUB-200-2011 and Food-101.

Similar resources

Localizing by Describing: Attribute-Guided Attention Localization for Fine-Grained Recognition

A key challenge in fine-grained recognition is how to find and represent discriminative local regions. Recent attention models are capable of learning discriminative region localizers only from category labels with reinforcement learning. However, not utilizing any explicit part information, they are not able to accurately find multiple distinctive regions. In this work, we introduce an attribu...

Where to Focus: Deep Attention-based Spatially Recurrent Bilinear Networks for Fine-Grained Visual Recognition

Fine-grained visual recognition typically depends on modeling subtle difference from object parts. However, these parts often exhibit dramatic visual variations such as occlusions, viewpoints, and spatial transformations, making it hard to detect. In this paper, we present a novel attention-based model to automatically, selectively and accurately focus on critical object regions with higher imp...

Fast Fine-grained Image Classification via Weakly Supervised Discriminative Localization

Fine-grained image classification is to recognize hundreds of subcategories in each basic-level category. Existing methods employ discriminative localization to find the key distinctions among similar subcategories. However, existing methods generally have two limitations: (1) Discriminative localization relies on region proposal methods to hypothesize the locations of discriminative regions, w...

Weakly-supervised Discriminative Patch Learning via CNN for Fine-grained Recognition

Research on fine-grained recognition has recently shifted from multistage frameworks to convolutional neural networks (CNN) that are trained end-to-end. Many previous end-to-end deep approaches typically consist of a recognition network and an auxiliary localization network trained with additional part annotations to detect semantic parts shared across classes. To avoid the cost of extra semant...

Integrating Scene Text and Visual Appearance for Fine-Grained Image Classification with Convolutional Neural Networks

Text in natural images contains rich semantics that are often highly relevant to objects or scene. In this paper, we focus on the problem of fully exploiting scene text for visual understanding. The main idea is combining word representations and deep visual features into a globally trainable deep convolutional neural network. First, the recognized words are obtained by a scene text reading sys...

Journal title:
  • CoRR

Volume abs/1603.06765

Publication date 2016